I.  INTRODUCTION


In prediction, data generated by some stochastic process are deduced from past observations. Given such a process whose statistics are well known, the optimal mean-squared predictor is the conditional mean, which is generally a complicated function of the past observations. Linear prediction operations are therefore widely used, due to their simplicity, and the classical theory of linear prediction for weakly stationary discrete-time processes is mainly due to Wiener, [19], and Kolmogorov, [11], [12]. However, such linear operations are notoriously unstable in the presence of contamination by data outliers (see Huber, [9], and Hampel, [7]), while the occurrence of such outliers is frequently observed in practice. In this paper, we develop and analyze a sequence of outlier resistant prediction operations. Our presentation combines the theories of saddle point games and qualitative robustness (for the latter see Boente et al, [1], Cox, [2], Hampel, [7], Papantoni-Kazakos and Gray, [13], and Papantoni-Kazakos, [14], [15], [16]). A similar approach was used by Tsaknakis, [18], for the development of outlier resistant filtering and smoothing operations.

Considerable effort has been dedicated to the development of minimax linear predictors, in cases when the spectral density of the process is not well defined, but is instead a member of some compact class (see Franke, [3], Franke and Poor, [4], Hosoya, [8], Kassam and Poor, [10], and Tsaknakis et al, [17]). Such predictors are highly sensitive to data outliers, however.

In this paper, one-step prediction is considered, and the organization is as follows: In section II, we present the formalization of the problem and we define important performance criteria for outlier resistant operations. In section III, we develop outlier resistant prediction operations and we study their asymptotic performance. In section IV, we examine the special case of first order autoregressive nominal processes. In section V, we draw some conclusions.




II.  PRELIMINARIES 


Let $R$ be the real line, and let $\mathcal{B}$ be the usual Borel $\sigma$-field on $R$. Let $R^{\infty}$ be the one-sided sequence space, and let $\mathcal{B}^{\infty}$ be the Borel $\sigma$-field on $R^{\infty}$ that is generated by the product topology on $R^{\infty}$. We consider a real-valued discrete-time process, $\{X_n, 1\le n<\infty\}$, whose measure $\mu_o$ is known and is defined on $\mathcal{B}^{\infty}$. We name $\{X_n, 1\le n<\infty\}$ the nominal process, and we denote by $\{x_n, 1\le n<\infty\}$ data realizations generated by it. Let $\hat{x}_n^o = \hat{x}_n^o(x_1^{n-1})$ denote the optimal one-step mean-squared prediction operation, given the sequence realization $x_1^{n-1} = \{x_1,\dots,x_{n-1}\}$, when $\{x_n, 1\le n<\infty\}$ is generated by the nominal process. Then, if $g_n = g_n(x_1^{n-1})$ denotes some scalar real-valued function on the sequence $x_1^{n-1}$, we have:

$$e_n(\mu_o,\hat{x}_n^o) = \inf_{g_n} e_n(\mu_o,g_n) \tag{1}$$

$$\hat{x}_n^o(x_1^{n-1}) = E_{\mu_o}\{X_n \mid X_1^{n-1} = x_1^{n-1}\} \tag{2}$$

where $E_{\mu_o}\{\cdot\}$ denotes expectation with respect to the measure $\mu_o$, where $X_1^{n-1} = \{X_1,\dots,X_{n-1}\}$, and where,

$$e_n(\mu_o,g_n) \triangleq E_{\mu_o}\{[X_n - g_n(X_1^{n-1})]^2\} \tag{3}$$

The expression in (3) is called the one-step prediction error induced by $g_n$ at $\mu_o$. Let $L_n$ denote the class of all the scalar real-valued linear functions defined on $R^{n-1}$. Let then $\hat{x}_n^L = \hat{x}_n^L(x_1^{n-1})$ be such that:

$$e_n(\mu_o,\hat{x}_n^L) = \inf_{g_n\in L_n} e_n(\mu_o,g_n) \tag{4}$$

Then, $\hat{x}_n^L$ is called the optimal linear one-step mean-squared predictor at $\mu_o$, given the sequence realization $x_1^{n-1}$, and generally,

$$e_n(\mu_o,\hat{x}_n^o) \le e_n(\mu_o,\hat{x}_n^L) \tag{5}$$

If the measure $\mu_o$ is Gaussian, then $\hat{x}_n^o(x_1^{n-1}) = \hat{x}_n^L(x_1^{n-1})$, $\forall n$, and (5) is then satisfied with equality for all $n$. If $\mu_o$ is non Gaussian, then (5) is generally a strict inequality.
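As an illustration of the optimal linear one-step predictor described above, the following is a minimal sketch, not part of the original development, that solves the normal equations for the prediction coefficients from a given autocovariance function. The function names and the first-order autoregressive autocovariance used as the example (with a hypothetical coefficient `ALPHA = 0.4`) are assumptions made here for illustration only.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def linear_predictor_coefficients(autocov, k):
    """Coefficients of the optimal linear predictor of X_{k+1} from X_1..X_k.

    autocov(lag) is the autocovariance of a zero-mean stationary process.
    The normal equations are P a = r, with P[i][j] = autocov(|i - j|) and
    r[i] = autocov(k - i) for the 0-based index i of X_{i+1}.
    """
    P = [[autocov(abs(i - j)) for j in range(k)] for i in range(k)]
    r = [autocov(k - i) for i in range(k)]
    return solve_linear_system(P, r)


ALPHA = 0.4  # hypothetical autoregressive coefficient, for illustration only


def ar1_autocov(lag):
    """Autocovariance of X_n = ALPHA * X_{n-1} + W_n with unit-variance W_n."""
    return ALPHA ** abs(lag) / (1.0 - ALPHA ** 2)


coeffs = linear_predictor_coefficients(ar1_autocov, 3)
```

For this first-order autoregressive example, only the most recent observation should carry weight, so `coeffs` is expected to be close to `[0, 0, ALPHA]`.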


The above summary corresponds to parametric one-step prediction; that is, it corresponds to the case where the measure that generates the data sequences is known. In this paper, we are concerned with the outlier model. Then, the observation process $\{Y_n, 1\le n<\infty\}$ is generated by three mutually independent processes, the nominal process $\{X_n, 1\le n<\infty\}$ and two i.i.d. processes $\{V_n, 1\le n<\infty\}$ and $\{Z_n, 1\le n<\infty\}$, as follows:

$$Y_n = (1-V_n)X_n + V_n Z_n,\quad n=1,2,\dots \tag{6}$$

where the common distribution of the variables $Z_n$, $1\le n$, is unknown, and where $\{V_n, 1\le n<\infty\}$ is a binary process. In particular, for some given $\varepsilon : 0\le\varepsilon<1$, the latter process is such that:

$$P(V_k = 0) = 1-\varepsilon,\qquad P(V_k = 1) = \varepsilon \tag{7}$$


In the outlier model in (6), $\{Z_n, 1\le n<\infty\}$ is called the contaminating process, and $\{V_n, 1\le n<\infty\}$ determines the contamination law. In the presence of the latter model, the objective is prediction of the nominal datum $x_n$, given the observation sequence $y_1^{n-1}$, for all $n$, and the problem formalization is then clearly nonparametric. Let $\mu$ denote the measure of the observation process, and let $\{g_n\}_{1\le n<\infty}$ denote a sequence of one-step predictors, where $g_n = g_n(y_1^{n-1})$. Let us then define,

$$e_n(\mu,g_n) \triangleq E_{\mu}\{[X_n - g_n(Y_1^{n-1})]^2\} \tag{8}$$


In (8), $e_n(\mu,g_n)$ is the mean-squared error induced by the predictor $g_n$, when the measure of the observation process $\{Y_n, 1\le n<\infty\}$ is $\mu$, and where $X_n$ is generated by the nominal process whose measure is $\mu_o$. Clearly, $e_n(\mu_o,g_n)$ is then as in (3), and it represents the mean-squared performance of the predictor $g_n$ at the nominal measure $\mu_o$ (that is, when outliers are absent).
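The outlier model in (6) and (7) can be simulated directly. The sketch below is an illustration only: the function name `contaminate`, the standard Gaussian nominal samples, and the constant outlier amplitude of 50 are assumptions chosen for the example and do not come from the text.

```python
import random


def contaminate(nominal, eps, outlier_sampler, rng):
    """Apply Y_n = (1 - V_n) X_n + V_n Z_n with i.i.d. V_n, P(V_n = 1) = eps.

    Returns the observed sequence and the binary contamination flags V_n.
    """
    observed = []
    flags = []
    for x in nominal:
        v = 1 if rng.random() < eps else 0
        z = outlier_sampler(rng)
        observed.append((1 - v) * x + v * z)
        flags.append(v)
    return observed, flags


rng = random.Random(0)
# Hypothetical nominal data: i.i.d. standard Gaussian samples.
nominal = [rng.gauss(0.0, 1.0) for _ in range(10000)]
# Hypothetical contaminating process: deterministic amplitude w = 50.
observed, flags = contaminate(nominal, 0.1, lambda r: 50.0, rng)
```

With `eps = 0` the observed sequence coincides with the nominal one, which corresponds to evaluating a predictor at the nominal measure.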

Our objective is to design a sequence $\{g_n\}_{1\le n<\infty}$ of predictors whose mean-squared performance is stable in the presence of variations in the measure $\mu$ of the observation process $\{Y_n, 1\le n<\infty\}$. This stability corresponds to qualitative robustness, and is defined as follows:

Given $\eta>0$, there exists $\delta>0$, such that:

$$\Pi_{\rho}(\mu_o,\mu) < \delta \ \text{ implies } \ |e_n(\mu_o,g_n) - e_n(\mu,g_n)| < \eta,\quad \forall n$$

In the above definition, $\Pi_{\rho}$ denotes Prohorov distance with an appropriate distortion measure $\rho$ on data sequences, and sequences $\{g_n\}$ of operations that satisfy this stability are called qualitatively robust at the measure $\mu_o$. As found first in [13], and later in [1], [14], and [16], for the sequence $\{g_n\}$ to be qualitatively robust, pointwise continuity and asymptotic continuity, in conjunction with boundedness, are sufficient. In particular, it is sufficient that $g_n$ is bounded for all $n$, and:

(A) Given finite $n$, given $\eta>0$, given $x_1^n$, there exists $\delta>0$, such that,
$y_1^n : \rho_n(x_1^n,y_1^n) < \delta$ implies $|g_{n+1}(x_1^n) - g_{n+1}(y_1^n)| < \eta$.

(B) Given $\mu_o$ stationary, given $\zeta>0$, $\eta>0$, there exist integers $n_o$, $m$, some $\delta>0$, and for each $n>n_o$ some $A^n\in R^n$ with $\mu_o(A^n) > 1-\eta$, such that for each $x_1^n\in A^n$ and $y_1^n$ such that $\inf\{\alpha : \#[i : \rho_m(x_i^{i+m-1},y_i^{i+m-1}) > \alpha] \le n\alpha\} < \delta$, it is implied that $|g_{n+1}(x_1^n) - g_{n+1}(y_1^n)| < \zeta$.


Given a sequence $\{g_n\}$ of predictors which is qualitatively robust at the nominal measure $\mu_o$, its important quantitative performance criteria are: (1) Its asymptotic mean-squared performance at the nominal measure, $\limsup_{n\to\infty} e_n(\mu_o,g_n)$. (2) Its breakdown point. (3) Its influence function. The breakdown point and the influence function represent measures of resistance to outliers, and their definitions are given below.

Consider the model in (6), and let then $\{Z_n\}$ be a deterministic process with amplitude $w$; that is, $P(Z_n = w) = 1$. Let then $\mu_{\varepsilon,w}$ be the measure of the observation process $\{Y_n\}$. Given a sequence $\{g_n\}$ of predictors, we then define:

Influence Function of the sequence $\{g_n\}$:

$$I_g(w) \triangleq \lim_{\varepsilon\to 0} \frac{e(\mu_{\varepsilon,w},g) - e(\mu_o,g)}{\varepsilon} \tag{9}$$

where,

$$e(\mu,g) \triangleq \limsup_{n\to\infty} e_n(\mu,g_n) \tag{10}$$

Breakdown point of the sequence $\{g_n\}$:

$$\varepsilon_g^* \triangleq \sup\{\varepsilon : e(\mu_{\varepsilon,\infty},g) < \limsup_{n\to\infty} E_{\mu_o}\{X_n^2\}\} \tag{11}$$

where $e(\mu,g)$ is defined as in (10).


We note that the breakdown point is the maximum frequency of independent, infinite-amplitude outliers that the prediction sequence can tolerate asymptotically without becoming useless (that is, before the observation sequences provide no information about the next process datum). The influence function represents the slope of the function $e(\mu_{\varepsilon,w},g) - e(\mu_o,g) = F_{\varepsilon,g}(w)$, at the $\varepsilon=0$ point. $F_{\varepsilon,g}(w)$ corresponds to the asymptotic mean-squared error increase induced by the prediction sequence $\{g_n\}$, when the environment shifts from absence of outliers to outlier occurrence with frequency $\varepsilon$ and amplitude $w$.




The outlier model in (6) can be generalized to i.i.d. sequences of $m$-size blocks of outliers, as follows:

$$Y_{(k-1)m+1}^{km} = (1-V_k)X_{(k-1)m+1}^{km} + V_k Z_{(k-1)m+1}^{km}\ ;\quad k=1,2,\dots \tag{12}$$

where the sequence $\{V_n\}$ is as in (7), and where the vector random variables $\{Z_{(k-1)m+1}^{km}\}$ are i.i.d. with unknown distribution. Let $\mu_{\varepsilon,w,m}$ denote the measure of the observation process $\{Y_n\}$, when the model in (12) is present, and when $P(Z_n = w) = 1$. Then, given a sequence $\{g_n\}$ of predictors, and defining $e(\mu,g)$ as in (10), the breakdown point, $\varepsilon_{g,m}^*$, and the influence function, $I_{g,m}(w)$, that correspond to the outlier model in (12) are defined as follows:

$$\varepsilon_{g,m}^* \triangleq \sup\{\varepsilon : e(\mu_{\varepsilon,\infty,m},g) < \limsup_{n\to\infty} E_{\mu_o}\{X_n^2\}\} \tag{13}$$

$$I_{g,m}(w) \triangleq \lim_{\varepsilon\to 0} \frac{e(\mu_{\varepsilon,w,m},g) - e(\mu_o,g)}{\varepsilon} \tag{14}$$
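The block model in (12) differs from (6) only in that one contamination decision is drawn per block of $m$ consecutive data. A minimal simulation sketch follows; the function name `contaminate_blocks`, the Gaussian nominal samples, and the constant amplitude are assumptions made for illustration.

```python
import random


def contaminate_blocks(nominal, eps, m, amplitude, rng):
    """Replace whole m-size blocks by a constant amplitude with probability eps.

    Block k covers indices (k-1)*m .. k*m - 1 (0-based), and a single
    binary variable V_k decides whether the entire block is an outlier block.
    """
    observed = list(nominal)
    for start in range(0, len(nominal), m):
        if rng.random() < eps:
            for i in range(start, min(start + m, len(nominal))):
                observed[i] = amplitude
    return observed


rng = random.Random(3)
# Hypothetical nominal data and outlier amplitude, for illustration only.
nominal = [rng.gauss(0.0, 1.0) for _ in range(1000)]
observed = contaminate_blocks(nominal, 0.2, 2, 50.0, rng)
```

Setting `m = 1` recovers the single-datum outlier model in (6).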


III.  OUTLIER  RESISTANT  PREDICTION  OPERATIONS 


We consider a stationary, zero mean, real-valued process $\{X_n, 1\le n<\infty\}$, with measure $\mu_o$, and $E_{\mu_o}\{X_n^2\} = \sigma^2 < \infty$. We also consider the outlier model in (12) for the observation process $\{Y_n, 1\le n<\infty\}$. We concentrate on the design of qualitatively robust and outlier resistant sequences $\{g_n\}$ of one-step predictors for the process $\{X_n, 1\le n<\infty\}$. Our methodology involves two steps: (1) A saddle-point game formalization and solution for the predictors $g_n : 2\le n\le m+1$. (2) A qualitatively robust generalization of the solutions in step 1, for the predictors $g_n : n>m+1$.

In the sequel, we will assume that both the nominal and the contaminating processes are absolutely continuous. We will then denote by $f_o(x_1^n)$ the density function induced by the nominal process at the vector point $x_1^n$; we will denote by $f(y_1^n)$ the density function induced by the observation process at the vector point $y_1^n$. We note that then, for $n : 2\le n\le m+1$, the class, $F_n$, of density functions induced by the model in (12) is as follows:

$$F_n = \Big\{f : f(y_1^{n-1}) - (1-\varepsilon)f_o(y_1^{n-1}) \ge 0;\ \forall\, y_1^{n-1}\in R^{n-1},\ \int_{R^{n-1}} f(y_1^{n-1})\,dy_1^{n-1} = 1\Big\} \tag{15}$$


Construction  of  Prediction  Operations  -  Step  1 

Let us consider the model in (12) and one-step prediction based on observation sequences $y_1^{n-1}$, with $n : 2\le n\le m+1$. Given such an $n$, we consider the following saddle point game, where $F_n$ is as in (15):

Find a pair, $(f^*,g_n^*)$, of an observation density function and a one-step predictor, such that $f^*\in F_n$, and:

$$\forall f\in F_n :\ e_n(f,g_n^*) \le e_n(f^*,g_n^*) \le e_n(f^*,g_n)\ ;\quad \forall g_n \tag{16}$$

In (16), the errors $e_n(f,g_n)$ are as in (8), where the measure, $\mu$, has been substituted by the corresponding density function, $f$.

Consider a pair, $(f',g_n')$, of an observation density and a prediction operation, such that:

$$(f',g_n') :\ e_n(f',g_n') = \sup_{f\in F_n}\inf_{g_n} e_n(f,g_n) \tag{17}$$

From the results in [15] we then conclude that if the operation $g_n' = g_n'(y_1^{n-1})$ is pointwise continuous and bounded, then $(f',g_n') = (f^*,g_n^*)$, and the pair is a unique solution of the game in (16). We now present a theorem whose proof is in the Appendix.


Theorem  1 

Let the nominal process be zero mean Gaussian. Let then $P_n$ denote the $n$-dimensional autocovariance matrix of this process, and let $m_o(y_1^{n-1}) = B_{n-1}^T y_1^{n-1}$ denote the optimal at the Gaussian nominal process one-step predictor, when the observation sequence is $y_1^{n-1}$. Let $n : 2\le n\le m+1$. Then, the pair $(f',g_n')$ in (17) is as follows:

$$g_n'(y_1^{n-1}) = m_o(y_1^{n-1})\cdot\min\big(1,\ \lambda_n\{(y_1^{n-1})^T P_{n-1}^{-1} y_1^{n-1}\}^{-1/2}\big) \tag{18}$$

$$f'(y_1^{n-1}) = (1-\varepsilon)f_o(y_1^{n-1})\cdot\max\big(1,\ \lambda_n^{-1}\{(y_1^{n-1})^T P_{n-1}^{-1} y_1^{n-1}\}^{1/2}\big) \tag{19}$$

where $\lambda_n$ is unique, and such that:

$$\lambda_n :\ \int_{R^{n-1}} f'(y_1^{n-1})\,dy_1^{n-1} = 1 \tag{20}$$

Since the operation in (18) is pointwise continuous and bounded, $(f',g_n') = (f^*,g_n^*)$, and the pair is a unique solution of the game in (16).


When the nominal process is non Gaussian, the operation $g_n'$ in (17) is generally not pointwise continuous; thus, there is no guarantee then that it will satisfy the game in (16), and it is generally qualitatively nonrobust. However, drawing from linear prediction in the absence of outliers, we will adopt the operations in Theorem 1 for non Gaussian nominal processes as well. Specifically:

Let the nominal process be stationary and zero mean, with $n$-dimensional autocovariance matrix $P_n$. Let $m_o(y_1^{n-1}) = B_{n-1}^T y_1^{n-1}$ denote the optimal at the nominal process linear one-step predictor, when the observation sequence is $y_1^{n-1}$. Let $f_G$ denote densities of the Gaussian process whose power spectral density is the same as that of the nominal process, and whose mean is zero. Then, in the presence of the outlier model in (12), and for $n : 2\le n\le m+1$, we adopt the following one-step prediction operation:

$$g_n^*(y_1^{n-1}) = m_o(y_1^{n-1})\cdot\min\big(1,\ \lambda_n\{(y_1^{n-1})^T P_{n-1}^{-1} y_1^{n-1}\}^{-1/2}\big)$$
$$\lambda_n :\ \int_{R^{n-1}} f_G(y_1^{n-1})\cdot\max\big(1,\ \lambda_n^{-1}\{(y_1^{n-1})^T P_{n-1}^{-1} y_1^{n-1}\}^{1/2}\big)\,dy_1^{n-1} = (1-\varepsilon)^{-1} \tag{21}$$

We note that for $\varepsilon=0$, the value of $\lambda_n$ is infinity and the operation $g_n^*$ becomes identical to the optimal at the nominal linear one-step predictor. As $\varepsilon$ increases, $\lambda_n$ decreases monotonically, becoming zero at $\varepsilon=1$.
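The prediction operation adopted above scales the linear prediction down whenever the quadratic form of the observations with the inverse autocovariance matrix exceeds the threshold. The following is a minimal sketch of that operation, assuming the caller supplies the linear prediction coefficients and the inverse autocovariance (precision) matrix; the function name `robust_predict` and the numerical values in the example (a first-order autoregressive process with a hypothetical coefficient 0.4 and threshold 2) are illustrative assumptions.

```python
import math


def robust_predict(y, coeffs, precision, lam):
    """Evaluate (coeffs . y) * min(1, lam / sqrt(y^T precision y)).

    coeffs holds the linear one-step prediction coefficients and precision
    is the inverse of the autocovariance matrix of the observed vector.
    """
    linear = sum(b * v for b, v in zip(coeffs, y))
    quad = sum(y[i] * precision[i][j] * y[j]
               for i in range(len(y)) for j in range(len(y)))
    if quad <= 0.0:
        return linear  # y is the zero vector, so the prediction is zero
    return linear * min(1.0, lam / math.sqrt(quad))


# Hypothetical example: X_n = 0.4 X_{n-1} + W_n with unit-variance W_n.
alpha = 0.4
coeffs = [0.0, alpha]                      # only the last datum is used
precision = [[1.0, -alpha], [-alpha, 1.0]]  # inverse 2x2 autocovariance
small = robust_predict([0.1, 0.2], coeffs, precision, 2.0)
large = robust_predict([100.0, 100.0], coeffs, precision, 2.0)
```

For small observations the output equals the linear prediction, while for large observations the magnitude stays below `lam * sqrt(coeffs^T P coeffs)`, which follows from the Cauchy-Schwarz inequality and illustrates the boundedness of the operation.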
Construction  of  Prediction  Operations  -  Step  2


In this part, we are concerned with the construction of qualitatively robust prediction operations, for large dimensionalities of observation sequences. We point out that the operations in (21) are qualitatively robust for finite such dimensionalities only. Indeed, they satisfy condition (A) in section II and are bounded, but they do not satisfy condition (B). At the same time, the outlier model in (12) does not allow for the formalization of a saddle point game for arbitrary data dimensionalities, even when the nominal process is Gaussian. We will thus adopt an ad hoc approach.

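Let $\{a_j^{(n)}\}_{1\le j\le n}$ denote the one-step prediction coefficients of the nominal process, when $n$ observation data are available. That is, if $m_o(y_1^n)$ denotes the optimal at the nominal linear one-step predictor when the observation sequence is $y_1^n$, then:

$$m_o(y_1^n) = \sum_{j=1}^{n} a_j^{(n)} y_j \tag{22}$$

Let $g_n^*$ be as in (21). Then, we propose the following sequence, $\{G_n^*\}$, of one-step predictors:

$$G_n^*(y_1^{n-1}) = g_n^*(y_1^{n-1})\ ;\quad \text{for } 2\le n\le m+1$$
$$G_n^*(y_1^{n-1}) = \sum_{j=1}^{m} a_j^{(n-1)}\,\frac{g_{j+1}^*(y_1^{j}) - g_{j+1}^*(0,y_1^{j-1})}{a_j^{(j)}} + \sum_{j=m+1}^{n-1} a_j^{(n-1)}\,\frac{g_{m+1}^*(y_{j-m+1}^{j}) - g_{m+1}^*(0,y_{j-m+1}^{j-1})}{a_m^{(m)}}\ ;\quad \text{for } n>m+1 \tag{23}$$

where $(0,y_k^{\ell})$ denotes the sequence $\{y_k,y_{k+1},\dots,y_{\ell},0\}$.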

We observe that the sequence $\{G_n^*\}$ in (23) degenerates to the sequence of the optimal at the nominal linear predictors, when in the model in (12) $\varepsilon=0$ (design in the absence of outliers). In addition, using a proof similar to that in [15], we can easily show that the sequence $\{G_n^*\}$ is qualitatively robust (satisfying condition (B) in section II), if:

$$\sup_k \sum_{j=1}^{k} |a_j^{(k)}| < \infty \tag{24}$$


Asymptotic  Performance  at  the  Nominal  Process 


In this part, we focus on the asymptotic mean-squared error induced by the sequence $\{G_n^*\}$ in (23), at the nominal process. In particular, we wish to evaluate $e(\mu_o,G^*)$, where,

$$e(\mu_o,G^*) = \limsup_{n\to\infty} e_n(\mu_o,G_n^*) \tag{25}$$

Let $e_o$ denote the asymptotic mean-squared error induced by the optimal at the nominal linear mean-squared predictor, when the observations are generated by the nominal process; that is,

$$e_o \triangleq \limsup_{n\to\infty} E_{\mu_o}\Big\{\Big[X_n - \sum_{j=1}^{n-1} a_j^{(n-1)} Y_j\Big]^2\Big\} \tag{26}$$

Let us also define,

$$d \triangleq \limsup_{k\to\infty} \sum_{j=1}^{k} |a_j^{(k)}| \tag{27}$$

$$D_m^* \triangleq E_{\mu_o}^{1/2}\Big\{\Big[Y_m - \frac{g_{m+1}^*(Y_1^m) - g_{m+1}^*(0,Y_1^{m-1})}{a_m^{(m)}}\Big]^2\Big\} \tag{28}$$


Then, we can express the following theorem, whose proof is in the Appendix.
Theorem 2

Let the nominal process be zero mean and stationary, with $d<\infty$, $\limsup_{k\to\infty}\sum_{j=1}^{m}|a_j^{(k)}| = 0$, and $E_{\mu_o}\{X^2\}<\infty$. Then,

$$e(\mu_o,G^*) \le E_{\mu_o}\{X^2\} \tag{29}$$

$$e^{1/2}(\mu_o,G^*) - e_o^{1/2} \le d\,D_m^* \tag{30}$$

We note that $D_m^*$ decreases to zero, when the parameter $\lambda_m$ in (21) goes to infinity. Then, the asymptotic mean-squared error, $e(\mu_o,G^*)$, becomes identical to the optimal at the nominal linear mean-squared error, $e_o$. Also, $D_m^*$ is bounded from above by $E_{\mu_o}^{1/2}\{Y_m^2\}$, for every $\lambda_m$ value.

Outlier  Resistance 

In this part, we focus on the properties of the breakdown point and the influence function induced by the one-step prediction sequence $\{G_n^*\}$ in (23). We note that, as is well known, the breakdown point of the optimal at the nominal linear one-step prediction sequence is zero, and its influence function, $I(w)$, is quadratic and thus unbounded (see [18]). We now state the following theorem, whose proof is in the Appendix.

Theorem  3 

Let the nominal process be stationary and zero mean, with $d = \limsup_{k\to\infty}\sum_{j=1}^{k}|a_j^{(k)}| < \infty$, $E_{\mu_o}\{X^2\}<\infty$, and $\limsup_{k\to\infty}\sum_{j=1}^{m}|a_j^{(k)}| = 0$, for every given finite $m$. Let $\lambda_m$ in (21) be bounded. Then, the sequence $\{G_n^*\}$ in (23) has strictly positive breakdown point, and bounded influence function.

IV.  GAUSSIAN  AUTOREGRESSIVE  NOMINAL  PROCESS 

In this section, we consider a first-order autoregressive and Gaussian nominal process, and we then study the performance of the sequence $\{G_n^*\}$ in (23) in detail. In particular, let the nominal process $\{X_n, 1\le n<\infty\}$ be such that:

$$X_n = \alpha X_{n-1} + W_n \tag{31}$$

where $\alpha<0.5$ and where the variables $\{W_n\}$ are i.i.d., zero-mean, unit-variance Gaussian. The process $\{X_n, 1\le n<\infty\}$ is then zero mean, and asymptotically stationary with $\limsup_{n\to\infty} E_{\mu_o}\{X_n^2\} = (1-\alpha^2)^{-1}$.

Considering the above nominal process, Theorem 1 applies, and the operations in (18) and (23) take here the following asymptotic form:
For  J~.  C-l^-nri-l*  "  ayj *min<1 *A«{yj-m+l  +  L  ^i^i-l^  >

i*j -m+2 

(32) 


;  where. 


ra 

V  f“ax<1*x‘1{yi +(1_a2)£<yi'ayi-] 

_m  i“2 


)2)  )  * ( 2tt )  2(l-a2)  2 


•exp{ - - — =—  [y^  +(l-a2)Y'  (y.-ay,  ,)  ])dy“-(l-e) 

2(1-0  1  i^2 


$$\text{For } n\to\infty,\quad G_n^*(y_1^{n-1}) = g_{m+1}^*(y_{n-m}^{n-1}) \tag{34}$$


From the above expressions, we easily find the following expressions, where $\Phi(x)$ and $\phi(x)$ denote respectively the distribution and the density functions of the zero-mean and unit-variance Gaussian variable, at the point $x$.

$$\text{For } j\to\infty,\quad g_2^*(y_j) = \alpha y_j\,\min\Big(1,\ \frac{\lambda_1}{|y_j|}\Big) \tag{35}$$


A.:2<t>( - i-^-)  -1  +  2  *(  -  1  -r  )  =  (1-e) 


$$\text{For } j\to\infty,\quad g_3^*(y_j,y_{j-1}) = \alpha y_j\,\min\big(1,\ \lambda_2\{y_{j-1}^2 + (1-\alpha^2)(y_j-\alpha y_{j-1})^2\}^{-1/2}\big) \tag{37}$$


-  u-o'1 


The functions that determine the $\lambda_1$ and $\lambda_2$ values, in (36) and (38) respectively, are both monotonically decreasing with increasing $\lambda$, from $\infty$ to 1; thus, for $\varepsilon<1$, both $\lambda_1$ and $\lambda_2$ are unique. In addition, it can be easily seen that


*2^1' 


We will study the operations in (34), for $m=1$ and $m=2$. That is, we will analyze the operations in (35) and (37), in terms of performance at the nominal, breakdown point, and influence function.

Case m=1


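Then, from (35) and (34) and for $\lambda_1$ as in (36), we obtain:

$$\text{For } n\to\infty,\quad G_n^*(y_1^{n-1}) = \alpha y_{n-1}\,\min\Big(1,\ \frac{\lambda_1}{|y_{n-1}|}\Big) \tag{39}$$

Then, we easily find:

$$e(\mu_o,G^*) = \limsup_{n\to\infty}\big(E_{\mu_o}\{X_n^2\} - 2E_{\mu_o}\{X_n G_n^*(Y_1^{n-1})\} + E_{\mu_o}\{[G_n^*(Y_1^{n-1})]^2\}\big)$$
$$= (1-\alpha^2)^{-1}\Big\{1 - \alpha^2\big[2\Phi(\lambda_1\sqrt{1-\alpha^2}) - 1\big] + 2\alpha^2(1-\alpha^2)\lambda_1^2\big[1-\Phi(\lambda_1\sqrt{1-\alpha^2})\big] - 2\alpha^2\sqrt{1-\alpha^2}\,\lambda_1\,\phi(\lambda_1\sqrt{1-\alpha^2})\Big\} \tag{40}$$

$$I_{G^*}(w) = \alpha^2 w^2\,\min\Big(1,\ \frac{\lambda_1^2}{w^2}\Big) + (1-\alpha^2)^{-1}\alpha^2\Big\{\big[2\Phi(\lambda_1\sqrt{1-\alpha^2}) - 1\big] - 2(1-\alpha^2)\lambda_1^2\big[1-\Phi(\lambda_1\sqrt{1-\alpha^2})\big] + 2\sqrt{1-\alpha^2}\,\lambda_1\,\phi(\lambda_1\sqrt{1-\alpha^2})\Big\} \tag{41}$$

$$\varepsilon_{G^*}^* = 1 - (1-\alpha^2)\lambda_1^2\Big\{(1-\alpha^2)\lambda_1^2 + \big[2\Phi(\lambda_1\sqrt{1-\alpha^2}) - 1\big] - 2(1-\alpha^2)\lambda_1^2\big[1-\Phi(\lambda_1\sqrt{1-\alpha^2})\big] + 2\sqrt{1-\alpha^2}\,\lambda_1\,\phi(\lambda_1\sqrt{1-\alpha^2})\Big\}^{-1} \tag{42}$$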

where the expressions in (40), (41), and (42) provide respectively the mean-squared error at the nominal, the influence function, and the breakdown point induced by the operation in (39), when the nominal process is as in (31). The mean-squared error $e(\mu_o,G^*)$ in (40), and the breakdown point $\varepsilon_{G^*}^*$ in (42) are both convex functions of $\lambda_1$. In Figure 1, we plot them against $\lambda_1$. In Figure 2, we plot the influence function $I_{G^*}(w)$ against $|w|$.
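For the $m=1$ case, the mean-squared error at the nominal, the influence function, and the breakdown point can be evaluated numerically. The sketch below assumes the nominal process in (31) and a predictor that multiplies $\alpha$ by the last observation clipped to $[-\lambda_1,\lambda_1]$; under those assumptions the three quantities follow from Gaussian tail integrals in terms of $t=\lambda_1\sqrt{1-\alpha^2}$. The function names and the value $\alpha=0.4$ used in the checks are illustrative assumptions.

```python
import math


def std_normal_pdf(x):
    """Density of the zero-mean, unit-variance Gaussian variable."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Distribution function of the zero-mean, unit-variance Gaussian."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def _excess(alpha, lam):
    """[2 Phi(t) - 1] - 2 t^2 [1 - Phi(t)] + 2 t phi(t), t = lam*sqrt(1-alpha^2)."""
    t = lam * math.sqrt(1.0 - alpha ** 2)
    upper_tail = 0.5 * math.erfc(t / math.sqrt(2.0))  # 1 - Phi(t)
    return ((2.0 * std_normal_cdf(t) - 1.0)
            - 2.0 * t * t * upper_tail
            + 2.0 * t * std_normal_pdf(t))


def nominal_mse(alpha, lam):
    """Asymptotic mean-squared error at the nominal process."""
    return (1.0 - alpha ** 2 * _excess(alpha, lam)) / (1.0 - alpha ** 2)


def influence_function(alpha, lam, w):
    """Influence of outliers with amplitude w; constant once |w| >= lam."""
    return (alpha ** 2 * min(w * w, lam * lam)
            + alpha ** 2 * _excess(alpha, lam) / (1.0 - alpha ** 2))


def breakdown_point(alpha, lam):
    """Breakdown point for lam > 0."""
    scaled = (1.0 - alpha ** 2) * lam * lam
    return 1.0 - scaled / (scaled + _excess(alpha, lam))
```

As the threshold grows, `nominal_mse` approaches the innovation variance (here 1), while `breakdown_point` decreases toward zero, which is the tradeoff between performance at the nominal and outlier resistance.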

Case m=2

Then, from (37) and (34), and for $\lambda_2$ as in (38), we obtain:

$$\text{For } n\to\infty,\quad G_n^*(y_1^{n-1}) = \alpha y_{n-1}\,\min\big(1,\ \lambda_2\{y_{n-2}^2 + (y_{n-1}-\alpha y_{n-2})^2(1-\alpha^2)\}^{-1/2}\big) \tag{43}$$

Then, we find,


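$$e(\mu_o,G^*) = (1-\alpha^2)^{-1}\Big\{1 - \sqrt{2\pi}\,\alpha^2\Big[\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big[1-\Phi\Big(\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big)\Big] + \phi(0) - \phi\Big(\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big)\Big]\Big\} \tag{44}$$

$$I_{G^*,2}(w) = \alpha^2 w^2\,\min\Big(1,\ \frac{\lambda_2^2}{w^2[1+(1-\alpha^2)(1-\alpha)^2]}\Big) + \sqrt{2\pi}\,\alpha^2(1-\alpha^2)^{-1}\Big\{\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big[1-\Phi\Big(\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big)\Big] + \phi(0) - \phi\Big(\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big)\Big\} \tag{45}$$

$$\varepsilon_{G^*,2}^* = 1 - \alpha^2\lambda_2^2\Big(\alpha^2\lambda_2^2 + \sqrt{2\pi}\,\alpha^2\big[(1-\alpha^2)^{-1} + (1-\alpha)^2\big]\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big[1-\Phi\Big(\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big)\Big] + \sqrt{2\pi}\,\alpha^2\big[(1-\alpha^2)^{-1} + (1-\alpha)^2\big]\Big[\phi(0) - \phi\Big(\frac{\lambda_2}{\sqrt{1-\alpha^2}}\Big)\Big]\Big)^{-1} \tag{46}$$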

In (45) and (46), size-two blocks of independent outlier vectors have been considered, as in (12) with $m=2$. As functions of $\lambda_2$, the mean-squared error in (44) and the breakdown point in (46) behave respectively as those in Figure 1. Also, the influence function in (45) behaves similarly to that in Figure 2. We note that in the $m=1$ case, the found breakdown point and influence function are identical when size-$\ell$ blocks of outlier vectors are considered, for every $\ell\ge 1$.

Comparisons 

Let us compare the operations derived for cases $m=1$ and $m=2$. Since the frequency of the outliers in a given system is at most approximately known, the thresholds $\lambda_1$ and $\lambda_2$ in the above operations are selected ad hoc. Let us thus select:

$$\text{For } n\to\infty,\quad G_n^*(y_1^{n-1}) = \begin{cases} \alpha y_{n-1}\,\min\Big(1,\ \dfrac{A}{\sqrt{1-\alpha^2}\,|y_{n-1}|}\Big) & ;\ \text{for the } m=1 \text{ case} \\[2ex] \alpha y_{n-1}\,\min\Big(1,\ \dfrac{A}{\big\{(y_{n-1}-\alpha y_{n-2})^2 + \frac{y_{n-2}^2}{1-\alpha^2}\big\}^{1/2}}\Big) & ;\ \text{for the } m=2 \text{ case} \end{cases} \tag{47}$$


Let us denote by $I_m(w)$; $m=1,2$, the influence functions induced by the operations in (47), for the cases $m=1$ and $m=2$ respectively, when the nominal process is as in (31). Let us denote by $e_m$; $m=1,2$, the corresponding mean-squared errors induced by the operations at the nominal process in (31). Then, modifying the thresholds appropriately in expressions (40), (41), (44), and (45), we find after some tedious but straightforward manipulations:

$$e_1 - e_2 = \alpha^2(1-\alpha^2)^{-1}F(A) \tag{48}$$

$$I_1(w) - I_2(w) = -\alpha^2(1-\alpha^2)^{-1}F(A) + \alpha^2 w^2\Big\{\min\Big(1,\ \frac{A^2}{(1-\alpha^2)w^2}\Big) - \min\Big(1,\ \frac{A^2}{[(1-\alpha^2)^{-1} + (1-\alpha)^2]w^2}\Big)\Big\} \tag{49}$$

where,

$$F(A) = \big[2A^2 + \sqrt{2\pi}\,A + 2\big]\big[1-\Phi(A)\big] - \big[2A + \sqrt{2\pi}\big]\phi(A) \tag{50}$$

The function $F(A)$ is nonpositive for all positive $A$ values, while the expression in the brackets of (49) is nonnegative for all $A$ and $w$. Thus,

$$e_2 \ge e_1 \ \text{ and } \ I_1(w) \ge I_2(w)\ ;\quad \forall w,\ \forall A \tag{51}$$


The inequalities in (51) express a tradeoff. Indeed, the operation for $m=2$ in (47) provides uniformly better protection against outliers than the operation for $m=1$, but the former induces a uniformly higher mean-squared error at the nominal process than the latter does. Generally, given each of the operations separately, as $A$ increases, the mean-squared error at the nominal process decreases, but the breakdown point decreases as well, and the influence function increases uniformly. Thus, the selection of one operation among those in (47) and the choice of the threshold $A$ in it depend on the desired tradeoff between performance at the nominal and protection against outliers.
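The tradeoff between performance at the nominal and protection against outliers can also be observed by simulation. The following Monte Carlo sketch compares the linear predictor with a thresholded predictor of the $m=1$ type (the last observation clipped to $[-\lambda,\lambda]$ before multiplication by $\alpha$) on a first-order autoregressive nominal process with single-datum outliers of constant amplitude. The function name and all numerical settings are assumptions chosen for illustration, not values taken from the text.

```python
import random


def simulate_errors(alpha, lam, eps, amplitude, n_samples, seed):
    """Return empirical mean-squared errors (linear, thresholded).

    The nominal process is X_n = alpha * X_{n-1} + W_n with unit-variance
    Gaussian W_n; each observed datum is replaced by `amplitude` with
    probability eps, independently of everything else.
    """
    rng = random.Random(seed)
    x_prev = rng.gauss(0.0, 1.0 / (1.0 - alpha ** 2) ** 0.5)
    lin_sq = 0.0
    rob_sq = 0.0
    for _ in range(n_samples):
        x_next = alpha * x_prev + rng.gauss(0.0, 1.0)
        y = amplitude if rng.random() < eps else x_prev
        linear = alpha * y
        # alpha * y * min(1, lam / |y|) equals alpha times y clipped to [-lam, lam].
        robust = alpha * max(-lam, min(lam, y))
        lin_sq += (x_next - linear) ** 2
        rob_sq += (x_next - robust) ** 2
        x_prev = x_next
    return lin_sq / n_samples, rob_sq / n_samples


# Hypothetical settings: 10% outliers of amplitude 50, threshold 1.5.
linear_mse, robust_mse = simulate_errors(0.4, 1.5, 0.1, 50.0, 20000, 1)
```

Under these settings the error of the linear predictor grows with the square of the outlier amplitude, while the error of the thresholded predictor remains close to its value at the nominal process.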

V.  CONCLUSIONS 

We derived a class of outlier resistant prediction operations. Those operations are nonlinear functions of the observed data sequences and combine good performance in the absence of outliers with protection against data outliers. The class involves a threshold parameter and a data block size used as a basis in the construction. The two parameters are involved in a tradeoff between performance at the nominal process and outlier resistance. The selection of the threshold parameter is also based on a similar tradeoff. The operations in our class are qualitatively robust.


Figure 1. Autoregressive Gaussian Nominal Process and m=1

Figure 2. Autoregressive Gaussian Nominal Process and m=1


APPENDIX 


Proof  of  Theorem  1 


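We easily find that,

$$\inf_{g_n} e_n(f,g_n) = E_{\mu_o}\{X_n^2\} - (1-\varepsilon)^2\int_{R^{n-1}} f^{-1}(y_1^{n-1})\big[f_o(y_1^{n-1})\,m_o(y_1^{n-1})\big]^2\,dy_1^{n-1}$$

Thus, $\sup_{f\in F_n}\inf_{g_n} e_n(f,g_n)$ corresponds to:

$$\inf_{f\in F_n}\int_{R^{n-1}} f^{-1}(y_1^{n-1})\big[f_o(y_1^{n-1})\,m_o(y_1^{n-1})\big]^2\,dy_1^{n-1} \tag{A.1}$$

Applying calculus of variations on (A.1), subject to the constraints $\int_{R^{n-1}} f(y_1^{n-1})\,dy_1^{n-1} = 1$ and $f(y_1^{n-1}) - (1-\varepsilon)f_o(y_1^{n-1}) \ge 0$; $\forall\, y_1^{n-1}\in R^{n-1}$, we find the solution in the Theorem.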


Proof  of  Theorem  2 


Expression (29) is obvious, and is attained with equality iff $\lambda_n=0$ in (21). Regarding expression (30), applying the Schwarz inequality and using (22), we obtain,

$$e_n(\mu_o,G_n^*) = E_{\mu_o}\{[X_n - G_n^*(Y_1^{n-1})]^2\} = E_{\mu_o}\{[X_n - m_o(Y_1^{n-1})]^2\} + E_{\mu_o}\{[m_o(Y_1^{n-1}) - G_n^*(Y_1^{n-1})]^2\} + 2E_{\mu_o}\{[X_n - m_o(Y_1^{n-1})][m_o(Y_1^{n-1}) - G_n^*(Y_1^{n-1})]\} \tag{A.2}$$

where,

$$\big|E_{\mu_o}\{[X_n - m_o(Y_1^{n-1})][m_o(Y_1^{n-1}) - G_n^*(Y_1^{n-1})]\}\big| \le E_{\mu_o}^{1/2}\{[X_n - m_o(Y_1^{n-1})]^2\}\,E_{\mu_o}^{1/2}\{[m_o(Y_1^{n-1}) - G_n^*(Y_1^{n-1})]^2\} \tag{A.3}$$

From (A.2) and (A.3) we obtain:

$$e_n^{1/2}(\mu_o,G_n^*) \le E_{\mu_o}^{1/2}\{[X_n - m_o(Y_1^{n-1})]^2\} + E_{\mu_o}^{1/2}\{[m_o(Y_1^{n-1}) - G_n^*(Y_1^{n-1})]^2\} \tag{A.4}$$

Also,

$$E_{\mu_o}^{1/2}\{[m_o(Y_1^{n-1}) - G_n^*(Y_1^{n-1})]^2\} \le \sum_{j=1}^{m}|a_j^{(n-1)}|\,E_{\mu_o}^{1/2}\Big\{\Big[Y_j - \frac{g_{j+1}^*(Y_1^{j}) - g_{j+1}^*(0,Y_1^{j-1})}{a_j^{(j)}}\Big]^2\Big\} + \sum_{j=m+1}^{n-1}|a_j^{(n-1)}|\,E_{\mu_o}^{1/2}\Big\{\Big[Y_j - \frac{g_{m+1}^*(Y_{j-m+1}^{j}) - g_{m+1}^*(0,Y_{j-m+1}^{j-1})}{a_m^{(m)}}\Big]^2\Big\}$$
$$\le D_m^*\sum_{j=m+1}^{n-1}|a_j^{(n-1)}| + \max_{1\le j\le m} E_{\mu_o}^{1/2}\Big\{\Big[Y_j - \frac{g_{j+1}^*(Y_1^{j}) - g_{j+1}^*(0,Y_1^{j-1})}{a_j^{(j)}}\Big]^2\Big\}\sum_{j=1}^{m}|a_j^{(n-1)}| \tag{A.5}$$

$$\limsup_{n\to\infty} E_{\mu_o}^{1/2}\{[X_n - m_o(Y_1^{n-1})]^2\} = e_o^{1/2} \tag{A.6}$$

Applying (A.5) and (A.6) to (A.4), and taking limits, we obtain (30).


Proof  of  Theorem  3 
Let us define,

$$a_i = \limsup_{k\to\infty} a_i^{(k)}$$

Then, we easily find,


I  .  (w)  *»  lim  - *- 

G  e-K) 


e(U<.  ,G*)  -  e(u  ,G*) 


<  -  e(yQ,G*)  +  {X2}  + 


+  8 


r  x  n 

ID 


(m) 

*-am  J 


[£]  H- 

m  J 

I  I  a^a  -  2  ^aiaj 

i+j  K  J  i+J 


<  4 


m 


La 


(m) 


m 


(d  )  +  E  {X2}  +  e(Mo,C)  <“>, 

Po  ° 


since $e(\mu_o,G^*)$ is bounded via Theorem 2.

It can be easily found that $e(\mu_{\varepsilon,\infty},G^*)$ equals $e(\mu_o,G^*)$ at $\varepsilon=0$, and that it is monotonically increasing with increasing $\varepsilon$. In addition, $e(\mu_o,G^*) < E_{\mu_o}\{X^2\}$. Thus, the breakdown point $\varepsilon_{G^*}^*$ is positive.